Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems
Internal identifier: 000252 (Main/Exploration); previous: 000251; next: 000253
Authors: B. Chapman [United States]; F. Bregier [United States]; A. Patil [United States]; A. Prabhakar [United States]
Source:
- Concurrency and Computation: Practice and Experience [1532-0626]; 2002-07.
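English descriptors
- KwdEn: OpenMP; ccNUMA architectures; data distribution; data locality; restructuring; shared memory parallel programming; software distributed shared memory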
Abstract
OpenMP is emerging as a viable high‐level programming model for shared memory parallel systems. It was conceived to enable easy, portable application development on this range of systems, and it has also been implemented on cache‐coherent Non‐Uniform Memory Access (ccNUMA) architectures. Unfortunately, it is hard to obtain high performance on the latter architecture, particularly when large numbers of threads are involved. In this paper, we discuss the difficulties faced when writing OpenMP programs for ccNUMA systems, and explain how the vendors have attempted to overcome them. We focus on one such system, the SGI Origin 2000, and perform a variety of experiments designed to illustrate the impact of the vendor's efforts. We compare codes written in a standard, loop‐level parallel style under OpenMP with alternative versions written in a Single Program Multiple Data (SPMD) fashion, also realized via OpenMP, and show that the latter consistently provides superior performance. A carefully chosen set of language extensions can help us translate programs from the former style to the latter (or to compile directly, but in a similar manner). Syntax for these extensions can be borrowed from HPF, and some aspects of HPF compiler technology can help the translation process. It is our expectation that an extended language, if well compiled, would improve the attractiveness of OpenMP as a language for high‐performance computation on an important class of modern architectures. Copyright © 2002 John Wiley & Sons, Ltd.
Url: https://api.istex.fr/document/970529853F7F494E7C9F07FE183606ADC17CACF9/fulltext/pdf
DOI: 10.1002/cpe.646
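Affiliations: Department of Computer Science, University of Houston, Houston, TX 77204‐3010, United States (all four authors)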
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems</title>
<author><name sortKey="Chapman, B" sort="Chapman, B" uniqKey="Chapman B" first="B." last="Chapman">B. Chapman</name>
</author>
<author><name sortKey="Bregier, F" sort="Bregier, F" uniqKey="Bregier F" first="F." last="Bregier">F. Bregier</name>
</author>
<author><name sortKey="Patil, A" sort="Patil, A" uniqKey="Patil A" first="A." last="Patil">A. Patil</name>
</author>
<author><name sortKey="Prabhakar, A" sort="Prabhakar, A" uniqKey="Prabhakar A" first="A." last="Prabhakar">A. Prabhakar</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:970529853F7F494E7C9F07FE183606ADC17CACF9</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1002/cpe.646</idno>
<idno type="url">https://api.istex.fr/document/970529853F7F494E7C9F07FE183606ADC17CACF9/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">000C31</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Corpus" wicri:corpus="ISTEX">000C31</idno>
<idno type="wicri:Area/Main/Curation">000C31</idno>
<idno type="wicri:Area/Main/Exploration">000252</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Exploration">000252</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems</title>
<author><name sortKey="Chapman, B" sort="Chapman, B" uniqKey="Chapman B" first="B." last="Chapman">B. Chapman</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Houston, Houston, TX 77204‐3010</wicri:regionArea>
<placeName><region type="state">Texas</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Bregier, F" sort="Bregier, F" uniqKey="Bregier F" first="F." last="Bregier">F. Bregier</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Houston, Houston, TX 77204‐3010</wicri:regionArea>
<placeName><region type="state">Texas</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Patil, A" sort="Patil, A" uniqKey="Patil A" first="A." last="Patil">A. Patil</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Houston, Houston, TX 77204‐3010</wicri:regionArea>
<placeName><region type="state">Texas</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Prabhakar, A" sort="Prabhakar, A" uniqKey="Prabhakar A" first="A." last="Prabhakar">A. Prabhakar</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Houston, Houston, TX 77204‐3010</wicri:regionArea>
<placeName><region type="state">Texas</region>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Concurrency and Computation: Practice and Experience</title>
<title level="j" type="abbrev">Concurrency Computat.: Pract. Exper.</title>
<idno type="ISSN">1532-0626</idno>
<idno type="eISSN">1532-0634</idno>
<imprint><publisher>John Wiley &amp; Sons, Ltd.</publisher>
<pubPlace>Chichester, UK</pubPlace>
<date type="published" when="2002-07">2002-07</date>
<biblScope unit="volume">14</biblScope>
<biblScope unit="issue">8‐9</biblScope>
<biblScope unit="page" from="713">713</biblScope>
<biblScope unit="page" to="739">739</biblScope>
</imprint>
<idno type="ISSN">1532-0626</idno>
</series>
<idno type="istex">970529853F7F494E7C9F07FE183606ADC17CACF9</idno>
<idno type="DOI">10.1002/cpe.646</idno>
<idno type="ArticleID">CPE646</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1532-0626</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>OpenMP</term>
<term>ccNUMA architectures</term>
<term>data distribution</term>
<term>data locality</term>
<term>restructuring</term>
<term>shared memory parallel programming</term>
<term>software distributed shared memory</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">OpenMP is emerging as a viable high‐level programming model for shared memory parallel systems. It was conceived to enable easy, portable application development on this range of systems, and it has also been implemented on cache‐coherent Non‐Uniform Memory Access (ccNUMA) architectures. Unfortunately, it is hard to obtain high performance on the latter architecture, particularly when large numbers of threads are involved. In this paper, we discuss the difficulties faced when writing OpenMP programs for ccNUMA systems, and explain how the vendors have attempted to overcome them. We focus on one such system, the SGI Origin 2000, and perform a variety of experiments designed to illustrate the impact of the vendor's efforts. We compare codes written in a standard, loop‐level parallel style under OpenMP with alternative versions written in a Single Program Multiple Data (SPMD) fashion, also realized via OpenMP, and show that the latter consistently provides superior performance. A carefully chosen set of language extensions can help us translate programs from the former style to the latter (or to compile directly, but in a similar manner). Syntax for these extensions can be borrowed from HPF, and some aspects of HPF compiler technology can help the translation process. It is our expectation that an extended language, if well compiled, would improve the attractiveness of OpenMP as a language for high‐performance computation on an important class of modern architectures. Copyright © 2002 John Wiley &amp; Sons, Ltd.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Texas</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Texas"><name sortKey="Chapman, B" sort="Chapman, B" uniqKey="Chapman B" first="B." last="Chapman">B. Chapman</name>
</region>
<name sortKey="Bregier, F" sort="Bregier, F" uniqKey="Bregier F" first="F." last="Bregier">F. Bregier</name>
<name sortKey="Patil, A" sort="Patil, A" uniqKey="Patil A" first="A." last="Patil">A. Patil</name>
<name sortKey="Prabhakar, A" sort="Prabhakar, A" uniqKey="Prabhakar A" first="A." last="Prabhakar">A. Prabhakar</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/France/explor/AussoisV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000252 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000252 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/France |area= AussoisV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:970529853F7F494E7C9F07FE183606ADC17CACF9 |texte= Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems }}
This area was generated with Dilib version V0.6.29.